Abstract

This paper presents a statistical method to subtract background in a maximum likelihood
fit, without relying on any separate sideband or simulation for background modeling.
The method, called sFit, is an extension of the sPlot technique, which was originally
developed to reconstruct the true distribution of each data component. The sWeights
defined for the sPlot technique make it possible to construct a modified likelihood
function using only the signal probability density function and the events in the signal
region. The contribution of background events in the signal region to the likelihood
function cancels out on a statistical basis. Maximizing this likelihood function leads to
unbiased estimates of the fit parameters in the signal probability density function.
1 Introduction



The method of maximum likelihood is a common procedure for parameter estimation
in the analysis of experimental data. Suppose the probability density function (pdf)
$P(x; \theta)$ with unknown parameters $\theta$ describes a set of $N$ independent
measurements $x_e$. The values of $\theta$ that maximize the likelihood function

$$L(\theta) = \prod_{e=1}^{N} P(x_e; \theta) \qquad (1)$$

are taken to be the estimators for $\theta$. The conventional way to include background
in a maximum likelihood fit is to write the total pdf as

$$P(x; \theta, f_s) = f_s P_s(x; \theta) + (1 - f_s) P_b(x) \qquad (2)$$

where $f_s$ is the fraction of signal events in the data sample, and $P_s(x; \theta)$ and
$P_b(x)$ are the signal and background pdfs respectively. Usually the background pdf
$P_b(x)$ has to be obtained from either Monte Carlo simulation or separate sidebands. The
latter requires dividing the data into a signal region and sidebands using discriminating
variables $y$, which are supposed to be uncorrelated with $x$ for the background component.
Several problems may arise with this method: $P_b(x)$ may be too complicated to
parameterize; the parameterization of $P_b(x)$ obtained from simulation may be unreliable;
the sidebands may have $P_b(x)$ distributions very different from that of the signal region
if they are too far away from it; the sidebands may contain a significant signal component
if they are too close to it. It is therefore highly desirable to have an alternative method
which does not rely on a background parameterization from either simulation or separate
sidebands. This paper provides a solution by generalizing the sPlot technique [1],
originally developed to reconstruct the true distribution of $x$ for the signal component
using sWeights defined as functions of $y$, into a modified maximum likelihood method,
called sFit.



2 The sFit method 

Suppose the variables $x$ are uncorrelated with the discriminating variables $y$, i.e. the
distribution of $x$ is independent of $y$ for both the signal and the background component
(this condition is easier to satisfy for a smaller signal region in $y$). The data sample
contains $N_s$ signal events and $N_b$ background events. The distributions of $y$ for
signal and background are denoted $F_s(y)$ and $F_b(y)$ respectively. We assume that $N_s$,
$N_b$, $F_s(y)$ and $F_b(y)$ are known. Following the formalism of the sPlot technique, we
define an sWeight function for the signal component:

$$W_s(y) = \frac{V_{ss} F_s(y) + V_{sb} F_b(y)}{N_s F_s(y) + N_b F_b(y)} \qquad (3)$$

where the matrix $V$ is obtained by inverting the matrix

$$V^{-1}_{ij} = \sum_{e=1}^{N} \frac{F_i(y_e) F_j(y_e)}{\left(N_s F_s(y_e) + N_b F_b(y_e)\right)^2}, \qquad i, j \in \{s, b\} \qquad (4)$$

The sWeight $W_s(y_e)$ can then be calculated for each event. The basic idea of the sPlot
technique is that the histogram of $x_e$ weighted by $W_s(y_e)$ represents the true
distribution of $x$ for the signal component, because the background contribution to the
histogram cancels out with this choice of $W_s(y_e)$.
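The sWeight calculation of Equations (3) and (4) can be sketched in a few lines of Python. This is an illustration only, not the author's code: the function name `sweights`, the toy shapes $F_s(y) = 2y$ and $F_b(y) = 2(1 - y)$ on $0 \le y \le 1$, and the yields 400 and 600 are assumptions chosen to keep the example short.

```python
import math
import random

def sweights(f_sig, f_bkg, n_sig, n_bkg):
    """Signal sWeights from per-event values F_s(y_e), F_b(y_e) and known yields."""
    # Denominator N_s F_s(y_e) + N_b F_b(y_e) of Equations (3) and (4).
    denom = [n_sig * fs + n_bkg * fb for fs, fb in zip(f_sig, f_bkg)]
    # Elements of the symmetric 2x2 matrix V^{-1} of Equation (4).
    a_ss = sum(fs * fs / d ** 2 for fs, d in zip(f_sig, denom))
    a_sb = sum(fs * fb / d ** 2 for fs, fb, d in zip(f_sig, f_bkg, denom))
    a_bb = sum(fb * fb / d ** 2 for fb, d in zip(f_bkg, denom))
    # Analytic 2x2 inversion; only the signal row (V_ss, V_sb) is needed.
    det = a_ss * a_bb - a_sb * a_sb
    v_ss, v_sb = a_bb / det, -a_sb / det
    # Equation (3), evaluated for every event.
    return [(v_ss * fs + v_sb * fb) / d for fs, fb, d in zip(f_sig, f_bkg, denom)]

# Hypothetical toy sample: F_s(y) = 2y and F_b(y) = 2(1 - y) on [0, 1],
# generated by inverting the cumulative distributions.
rng = random.Random(7)
n_sig, n_bkg = 400, 600
y_sig = [math.sqrt(rng.random()) for _ in range(n_sig)]
y_bkg = [1.0 - math.sqrt(rng.random()) for _ in range(n_bkg)]
y_all = y_sig + y_bkg

weights = sweights([2 * y for y in y_all], [2 * (1 - y) for y in y_all], n_sig, n_bkg)
print("sum of all sWeights:", sum(weights))          # equals n_sig up to rounding
print("signal events only: ", sum(weights[:n_sig]))
print("background only:    ", sum(weights[n_sig:]))  # fluctuates around zero
```

The sum of all sWeights equals $N_s$ by construction of $V$, whatever the sample. The split into signal and background sums is only available in a toy where the true origin of each event is known; it shows the background weights summing to approximately zero.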

We can extend this idea one step further and define a weighted likelihood function

$$L_W(\theta) = \prod_{e=1}^{N} \left[P_s(x_e; \theta)\right]^{W_s(y_e)} \qquad (5)$$

We expect the background contribution to this likelihood function $L_W(\theta)$ to cancel
on a statistical basis, so that $\theta$ can be estimated by maximizing $L_W(\theta)$.
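The expected cancellation can be made explicit in the large-sample limit. The following is a sketch rather than a proof: it assumes that $x$ and $y$ are independent for both components, treats the matrix $V$ as fixed at its large-sample value, and introduces the symbol $\theta_0$ (not used elsewhere in this paper) for the true parameter values.

```latex
\begin{align*}
V^{-1}_{ij} &\simeq \int \frac{F_i(y)\,F_j(y)}{N_s F_s(y) + N_b F_b(y)}\,dy
  && \text{(sum over events $\to$ integral over the total $y$ density)} \\
\int W_s(y)\,F_j(y)\,dy &= \sum_{k \in \{s,b\}} V_{sk}\,V^{-1}_{kj} = \delta_{sj}
  && \text{(i.e. } \textstyle\int W_s F_s\,dy = 1,\ \int W_s F_b\,dy = 0\text{)} \\
\langle \ln L_W(\theta) \rangle
  &= N_s \int W_s F_s\,dy \int P_s(x;\theta_0)\,\ln P_s(x;\theta)\,dx \\
  &\quad + N_b \int W_s F_b\,dy \int P_b(x)\,\ln P_s(x;\theta)\,dx \\
  &= N_s \int P_s(x;\theta_0)\,\ln P_s(x;\theta)\,dx
\end{align*}
```

The last line is the expected log-likelihood of a pure signal sample of $N_s$ events, which is maximal at $\theta = \theta_0$. The background term drops out because it carries the factor $\int W_s F_b\,dy = 0$.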



3 Application 

We apply the sFit method to a simple case in the time-dependent analysis of B decays. The
signal pdf of the proper time $t$ is

$$P_s(t; A, \Gamma) = C_1 e^{-\Gamma t} \left(1 + A \sin(\Delta m\, t)\right) \qquad (6)$$

where $C_1$ is a normalization factor, $\Delta m = 17$ ps$^{-1}$ is known, and $A$ and
$\Gamma$ are the parameters to be determined. For simplicity, the dependence of the pdf on
the initial flavour of the B meson is not considered. The background events have a
different time distribution

$$P_b(t; \Gamma_b) = C_2 e^{-\Gamma_b t} \qquad (7)$$

where $C_2$ is a normalization factor and $\Gamma_b$ is unknown. The discriminating
variable is the B mass. The signal events have a known Gaussian mass distribution with a
standard deviation $\sigma_m = 15$ MeV and mean $m_0 = 5369$ MeV. The background events have
a known flat distribution. The signal mass window is centered at $m_0$. We consider
scenarios with different half-widths $\delta m$ of the mass window and different numbers of
signal and background events in the signal mass window.

As an example, Figure 1 shows the distribution of the B mass $m_B$, the total time
distribution, and the signal and background time distributions reconstructed using the
sWeights, for the scenario $\delta m/\sigma_m = 6$, $N_s = 5000$ and $N_b/N_s = 1.5$. The
$W_s(m_B)$ function for this scenario is shown in Figure 2. The fact that $W_s(m_B)$ has
positive values in the high signal purity region around $m_0$ and negative values in the
low purity area illustrates why the background contribution cancels in both the sPlot and
the sFit methods.

Figure 1: Distributions from a data set with $\delta m/\sigma_m = 6$, $N_s = 5000$ and
$N_b/N_s = 1.5$. Left: the B mass distribution; right: the total time distribution (solid),
as well as the signal time distribution (dashed) and background time distribution
(dot-dashed) reconstructed using the sPlot technique.

Figure 2: $W_s$ as a function of the B mass $m_B$ for the scenario $\delta m/\sigma_m = 6$,
$N_s = 5000$ and $N_b/N_s = 1.5$.
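The sign change of $W_s(m_B)$ can be reproduced numerically for this scenario. The sketch below is an assumption-laden illustration, not the code used for Figure 2: it replaces the sum over events in Equation (4) by an integral over the expected total mass density (a large-sample approximation), uses a simple midpoint rule, and all function names are hypothetical.

```python
import math

M0, SIGMA_M = 5369.0, 15.0       # MeV, from the text
HALF = 6 * SIGMA_M               # half-width of the mass window (delta m / sigma_m = 6)
N_SIG, N_BKG = 5000.0, 7500.0    # N_b / N_s = 1.5
# Normalisation of the Gaussian inside the finite mass window.
GAUSS_NORM = SIGMA_M * math.sqrt(2 * math.pi) * math.erf(HALF / (SIGMA_M * math.sqrt(2)))

def f_sig(m):
    """Gaussian signal mass pdf, normalised inside the window."""
    return math.exp(-0.5 * ((m - M0) / SIGMA_M) ** 2) / GAUSS_NORM

def f_bkg(m):
    """Flat background mass pdf inside the window."""
    return 1.0 / (2 * HALF)

def density(m):
    """Expected total event density N_s F_s(m) + N_b F_b(m)."""
    return N_SIG * f_sig(m) + N_BKG * f_bkg(m)

def integrate(func, n=2000):
    """Midpoint-rule integral of func over the mass window."""
    step = 2 * HALF / n
    return step * sum(func(M0 - HALF + (i + 0.5) * step) for i in range(n))

# Large-sample version of Equation (4): sum over events -> integral over the density.
a_ss = integrate(lambda m: f_sig(m) ** 2 / density(m))
a_sb = integrate(lambda m: f_sig(m) * f_bkg(m) / density(m))
a_bb = integrate(lambda m: f_bkg(m) ** 2 / density(m))
det = a_ss * a_bb - a_sb ** 2
v_ss, v_sb = a_bb / det, -a_sb / det

def w_s(m):
    """Signal sWeight of Equation (3) as a function of the B mass."""
    return (v_ss * f_sig(m) + v_sb * f_bkg(m)) / density(m)

for m in (M0, M0 + 2 * SIGMA_M, M0 + 4 * SIGMA_M, M0 + HALF):
    print(f"m = {m:7.1f} MeV   W_s = {w_s(m):+.3f}")
```

In this approximation the weight function integrates to one against $F_s$ and to zero against $F_b$, which is the property behind the background cancellation.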

For each scenario, 500 toy data sets are generated with

$$A = 0.5, \quad \Gamma = 0.65\ \mathrm{ps}^{-1}, \quad \Gamma_b = 0.8\ \mathrm{ps}^{-1} \qquad (8)$$

We fit each data set using two different methods: the sFit method described in Section 2
and, for reference, a conventional maximum likelihood method based on Equation 1 and the
total pdf

$$P(t; A, \Gamma, \Gamma_b) = f_s P_s(t; A, \Gamma) F_s(m) + (1 - f_s) P_b(t; \Gamma_b) F_b(m) \qquad (9)$$

where the shape of the background pdf $P_b(t; \Gamma_b)$ is assumed to be known except for
the parameter $\Gamma_b$.
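A single toy experiment with the sFit method can be sketched as follows. This is not the code used for the study. Several choices are assumptions made for illustration: the time range $t \ge 0$ with perfect resolution, a smaller sample ($N_s = 2000$, $N_b = 3000$) to keep the run short, and a hand-written alternating golden-section search standing in for a real minimizer such as Minuit. All function names are hypothetical.

```python
import math
import random

DELTA_M = 17.0               # ps^-1, oscillation frequency from the text
M0, SIGMA_M = 5369.0, 15.0   # MeV, signal mass mean and resolution from the text

def generate_toy(n_sig, n_bkg, amp, gamma, gamma_b, half_window, seed=0):
    """Return (times, masses) for one toy data set inside the mass window."""
    rng = random.Random(seed)
    times, masses = [], []
    while len(times) < n_sig:
        t = rng.expovariate(gamma)
        # Accept-reject on the oscillation factor 1 + A sin(dm t) of Equation (6).
        if rng.random() * (1.0 + abs(amp)) > 1.0 + amp * math.sin(DELTA_M * t):
            continue
        m = rng.gauss(M0, SIGMA_M)
        if abs(m - M0) <= half_window:
            times.append(t)
            masses.append(m)
    for _ in range(n_bkg):   # exponential time (Equation 7), flat mass
        times.append(rng.expovariate(gamma_b))
        masses.append(rng.uniform(M0 - half_window, M0 + half_window))
    return times, masses

def signal_sweights(masses, n_sig, n_bkg, half_window):
    """sWeights of Equations (3)-(4) for a Gaussian signal and a flat background."""
    norm = SIGMA_M * math.sqrt(2 * math.pi) * math.erf(half_window / (SIGMA_M * math.sqrt(2)))
    f_s = [math.exp(-0.5 * ((m - M0) / SIGMA_M) ** 2) / norm for m in masses]
    f_b = 1.0 / (2.0 * half_window)
    den = [n_sig * fs + n_bkg * f_b for fs in f_s]
    a_ss = sum((fs / d) ** 2 for fs, d in zip(f_s, den))
    a_sb = sum(fs * f_b / d ** 2 for fs, d in zip(f_s, den))
    a_bb = sum((f_b / d) ** 2 for d in den)
    det = a_ss * a_bb - a_sb ** 2
    v_ss, v_sb = a_bb / det, -a_sb / det
    return [(v_ss * fs + v_sb * f_b) / d for fs, d in zip(f_s, den)]

def golden_min(func, lo, hi, tol=1e-5):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = func(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)

def sfit(times, weights, n_pass=4):
    """Maximise the weighted likelihood of Equation (5) for the pdf of Equation (6)."""
    sum_w = sum(weights)
    sum_wt = sum(w * t for w, t in zip(weights, times))
    sines = [math.sin(DELTA_M * t) for t in times]

    def nll(amp, gamma):
        # 1/C_1: integral of exp(-gamma t) (1 + amp sin(dm t)) over t >= 0.
        inv_c1 = 1.0 / gamma + amp * DELTA_M / (gamma ** 2 + DELTA_M ** 2)
        osc = sum(w * math.log(1.0 + amp * s) for w, s in zip(weights, sines))
        return sum_w * math.log(inv_c1) + gamma * sum_wt - osc

    amp, gamma = 0.0, 1.0
    for _ in range(n_pass):  # alternate 1-D searches in Gamma and A
        gamma = golden_min(lambda g: nll(amp, g), 0.1, 3.0)
        amp = golden_min(lambda a: nll(a, gamma), -0.95, 0.95)
    return amp, gamma

half = 6 * SIGMA_M
times, masses = generate_toy(2000, 3000, amp=0.5, gamma=0.65, gamma_b=0.8,
                             half_window=half, seed=1)
weights = signal_sweights(masses, 2000, 3000, half)
amp_fit, gamma_fit = sfit(times, weights)
print(f"A = {amp_fit:.3f}, Gamma = {gamma_fit:.3f} ps^-1")
```

Only the signal pdf enters the fit. The background time distribution and $\Gamma_b$ are used to generate the toy but never appear in `sfit`.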

For each scenario and each fit method, the statistical errors and mean values of the
parameters $A$ and $\Gamma$ are obtained from a single-Gaussian fit to their estimated
values from the 500 data sets. An example is shown in Figure 3 and Figure 4 for the
scenario $\delta m/\sigma_m = 6$, $N_s = 5000$ and $N_b/N_s = 1.5$. The results for the
different scenarios are summarized in Table 1 and Table 2 for the sFit method and the
reference method respectively.

It can be seen that the sFit method gives unbiased estimates of the fit parameters $A$
and $\Gamma$. The statistical errors of the parameter estimates obtained with the sFit
method are larger than those obtained with the reference method (for example, 0.0254
versus 0.0223 for $\sigma(A)$ in the scenario $\delta m/\sigma_m = 6$, $N_b/N_s = 1.5$).
There are two reasons for this. First, while the background contribution to $L_W$ cancels,
part of the signal contribution is also lost due to the negative sWeights in the low purity
area, and the size of the loss depends on the size of the signal mass window. Second, the
cancellation of the background contribution is not exact due to statistical fluctuation,
and the size of the fluctuation depends on the background level in the signal region. In
general, the larger the signal mass window and the lower the background level, the smaller
the precision difference between the two methods. We should keep in mind that the reference
method takes full advantage of the knowledge of the background time distribution, which is
usually unavailable or unreliable in real data analysis; the parameter errors in Table 2
are therefore too optimistic and should be regarded as lower limits rather than realistic
estimates. The sWeight function defined in Equation 3 is not the only event weight function
that cancels the background contribution to the weighted likelihood function. It would be
interesting to investigate whether the sWeight is the optimal choice of event weight
function, i.e. the one that minimizes the parameter errors.

Figure 5 and Figure 6 show the distributions of $(A^{fit} - A^{input})/\delta A$ and
$(\Gamma^{fit} - \Gamma^{input})/\delta\Gamma$ for the sFit method and the reference method
respectively, where $\delta A$ and $\delta\Gamma$ are the parameter errors estimated by the
Minuit program according to $\ln L = \ln L_{max} - \frac{1}{2}$. The errors obtained this
way from the weighted likelihood function $L_W$ are evidently underestimated: the fitted
widths of the sFit distributions are about 1.3 to 1.4, whereas those of the reference
method are compatible with 1. The reason is that the effect of background fluctuation is
not properly accounted for. Reliable error estimates can be obtained from Monte Carlo
simulation.
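The check described above amounts to computing pulls over the toy ensemble. The sketch below uses the sample mean and standard deviation instead of the Gaussian fit used in this paper, and the input numbers (a true spread of 0.025 and a reported error of 0.019) are hypothetical values chosen only to mimic an underestimated error. The helper name `pull_summary` is likewise an assumption.

```python
import random
import statistics

def pull_summary(estimates, errors, true_value):
    """Sample mean and standard deviation of (estimate - true_value) / error."""
    pulls = [(est - true_value) / err for est, err in zip(estimates, errors)]
    return statistics.mean(pulls), statistics.stdev(pulls)

# Hypothetical toy results: the estimates scatter with a true spread of 0.025,
# while every fit reports an error of 0.019 (illustrative numbers only).
rng = random.Random(3)
estimates = [rng.gauss(0.5, 0.025) for _ in range(500)]
errors = [0.019] * 500

mean_pull, pull_width = pull_summary(estimates, errors, true_value=0.5)
print(f"pull mean = {mean_pull:+.2f}, pull width = {pull_width:.2f}")
```

A pull width significantly above 1 signals underestimated errors. The spread of the estimates over the toy ensemble is then the more reliable measure of the statistical error.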
[Figure 3 Gaussian fit results. Left ($A$): Constant 31.46 ± 1.72, Mean 0.4985 ± 0.0011,
Sigma 0.02536 ± 0.00080. Right ($\Gamma$): Prob 1, Constant 94.93 ± 5.20, Mean
0.6505 ± 0.0006, Sigma 0.01261 ± 0.00040.]

Figure 3: Distributions of the estimated values of $A$ (left) and $\Gamma$ (right) obtained
with the sFit method, with superimposed Gaussian fits, for the scenario
$\delta m/\sigma_m = 6$, $N_s = 5000$ and $N_b/N_s = 1.5$. The input values are $A = 0.5$
and $\Gamma = 0.65$ ps$^{-1}$.



Figure 4: Distributions of the estimated values of $A$ (left) and $\Gamma$ (right) obtained
with the reference method, with superimposed Gaussian fits, for the scenario
$\delta m/\sigma_m = 6$, $N_s = 5000$ and $N_b/N_s = 1.5$. The input values are $A = 0.5$
and $\Gamma = 0.65$ ps$^{-1}$.





$\delta m/\sigma_m$, $N_s$, $N_b/N_s$    $\sigma(A)$    mean of $A$    $\sigma(\Gamma)$ (ps$^{-1}$)    mean of $\Gamma$ (ps$^{-1}$)
4, 5000, 1                               0.0304         0.502          0.0134                          0.6504
6, 5000, 1.5                             0.0254         0.498          0.0126                          0.6504
4, 5000, 0.5                             0.0243         0.501          0.0115                          0.6511
6, 5000, 0.75                            0.0223         0.501          0.0107                          0.6496

Table 1: Statistical errors and mean values of $A$ and $\Gamma$ from 500 fits using the
sFit method for different scenarios. The uncertainties on the numbers are in the last
digits. The input values are $A = 0.5$ and $\Gamma = 0.65$ ps$^{-1}$.





$\delta m/\sigma_m$, $N_s$, $N_b/N_s$    $\sigma(A)$    mean of $A$    $\sigma(\Gamma)$ (ps$^{-1}$)    mean of $\Gamma$ (ps$^{-1}$)
4, 5000, 1                               0.0251         0.502          0.0129                          0.6506
6, 5000, 1.5                             0.0223         0.500          0.0124                          0.6502
4, 5000, 0.5                             0.0215         0.500          0.0113                          0.6511
6, 5000, 0.75                            0.0211         0.501          0.0105                          0.6496

Table 2: Statistical errors and mean values of $A$ and $\Gamma$ from 500 fits using the
conventional maximum likelihood method for different scenarios. The uncertainties on the
numbers are in the last digits. The input values are $A = 0.5$ and $\Gamma = 0.65$ ps$^{-1}$.
[Figure 5 Gaussian fit results for the two panels: (Prob 0.9997, Constant 17.35 ± 0.95,
Mean -0.06578 ± 0.06175, Sigma 1.38 ± 0.04) and (Prob 0.9999, Constant 18.25 ± 1.00,
Mean 0.01944 ± 0.05868, Sigma 1.311 ± 0.042).]

Figure 5: Distributions of $(A^{fit} - A^{input})/\delta A$ (left) and
$(\Gamma^{fit} - \Gamma^{input})/\delta\Gamma$ (right) obtained with the sFit method, with
superimposed Gaussian fits, for the scenario $\delta m/\sigma_m = 6$, $N_s = 5000$ and
$N_b/N_s = 1.5$.




[Figure 6 Gaussian fit results for the two panels: (Prob 1, Constant 25.88 ± 1.42,
Mean -0.00576 ± 0.04138, Sigma 0.9249 ± 0.0293) and (Prob 0.9172, Constant 23.06 ± 1.26,
Mean 0.00336 ± 0.04644, Sigma 1.038 ± 0.033).]

Figure 6: Distributions of $(A^{fit} - A^{input})/\delta A$ (left) and
$(\Gamma^{fit} - \Gamma^{input})/\delta\Gamma$ (right) obtained with the reference method,
with superimposed Gaussian fits, for the scenario $\delta m/\sigma_m = 6$, $N_s = 5000$ and
$N_b/N_s = 1.5$.
4 Conclusions



The sFit method presented in this paper fully exploits the idea of background cancellation
in a maximum likelihood fit. If the variables $x$ are uncorrelated with the discriminating
variables $y$ for both the signal and background components, one can define an event weight
function of $y$ that can be used not only to reconstruct the signal distribution of $x$, but
also to estimate parameters from the distribution of $x$ in a maximum likelihood fit without
explicitly modeling the background. The likelihood function constructed from the event
weights and the signal pdf of $x$ is free from background contribution on a statistical
basis. Maximizing this likelihood function leads to unbiased parameter estimates. The
method can substantially reduce the systematic uncertainties due to an unreliable
background model obtained from either sidebands or simulation, at the cost of a modest
increase in the statistical errors.